10-701 Machine Learning Final Project Report: Video Summarization via Deep Convolutional Networks
نویسندگان
چکیده
As the demand of video summarization techniques increases nowadays, many methods are proposed for how to extract best representating key frames of a video. While most of them rely on hand-crafted image features, we resort to the feature learning power of deep convolutional networks. In this final project, we propose to learn a new image representation such that the similarity of frames are specifically learned for video summarization task, directly supervised by humans key frame selection. To realize this idea, we propose and implement a loss function for deep network in Caffe. We also comprehensively studied baseline methods and discuss our qualitative result and properties with them.
منابع مشابه
545 Machine Learning , Fall 2011 Final Project
This project aims at applying neural network-based deep learning to the problem of extractive text summarization. Our work is inspired by the work of Collobert and Weston [Collobert et al., 2011], who created a unified deep learning architecture to learn several common NLP tasks. In this report, we give the motivation behind our work, describe our problem formulation and present some results.
متن کاملA hybrid EEG-based emotion recognition approach using Wavelet Convolutional Neural Networks (WCNN) and support vector machine
Nowadays, deep learning and convolutional neural networks (CNNs) have become widespread tools in many biomedical engineering studies. CNN is an end-to-end tool which makes processing procedure integrated, but in some situations, this processing tool requires to be fused with machine learning methods to be more accurate. In this paper, a hybrid approach based on deep features extracted from Wave...
متن کاملHand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study
Despite considerable enhances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more challenges, including higher computational complexity and arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...
متن کاملCystoscopy Image Classication Using Deep Convolutional Neural Networks
In the past three decades, the use of smart methods in medical diagnostic systems has attractedthe attention of many researchers. However, no smart activity has been provided in the eld ofmedical image processing for diagnosis of bladder cancer through cystoscopy images despite the highprevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...
متن کاملCrop Land Change Monitoring Based on Deep Learning Algorithm Using Multi-temporal Hyperspectral Images
Change detection is done with the purpose of analyzing two or more images of a region that has been obtained at different times which is Generally one of the most important applications of satellite imagery is urban development, environmental inspection, agricultural monitoring, hazard assessment, and natural disaster. The purpose of using deep learning algorithms, in particular, convolutional ...
متن کامل